Goto

Collaborating Authors

 adsorption energy


Catalyst GFlowNet for electrocatalyst design: A hydrogen evolution reaction case study

Podina, Lena, Humer, Christina, Duval, Alexandre, Schmidt, Victor, Ramlaoui, Ali, Chatterjee, Shahana, Bengio, Yoshua, Hernandez-Garcia, Alex, Rolnick, David, Therrien, Félix

arXiv.org Artificial Intelligence

Efficient and inexpensive energy storage is essential for accelerating the adoption of renewable energy and ensuring a stable supply, despite fluctuations in sources such as wind and solar. Electrocatalysts play a key role in hydrogen energy storage (HES), allowing the energy to be stored as hydrogen. However, the development of affordable and high-performance catalysts for this process remains a significant challenge. We introduce Catalyst GFlowNet, a generative model that leverages machine learning-based predictors of formation and adsorption energy to design crystal surfaces that act as efficient catalysts. We demonstrate the performance of the model through a proof-of-concept application to the hydrogen evolution reaction, a key reaction in HES, for which we successfully identified platinum as the most efficient known catalyst. In future work, we aim to extend this approach to the oxygen evolution reaction, where current optimal catalysts are expensive metal oxides, and open the search space to discover new materials. This generative modeling framework offers a promising pathway for accelerating the search for novel and efficient catalysts.


UMA: A Family of Universal Models for Atoms

Wood, Brandon M., Dzamba, Misko, Fu, Xiang, Gao, Meng, Shuaibi, Muhammed, Barroso-Luque, Luis, Abdelmaqsoud, Kareem, Gharakhanyan, Vahe, Kitchin, John R., Levine, Daniel S., Michel, Kyle, Sriram, Anuroop, Cohen, Taco, Das, Abhishek, Rizvi, Ammar, Sahoo, Sushree Jagriti, Ulissi, Zachary W., Zitnick, C. Lawrence

arXiv.org Artificial Intelligence

The ability to quickly and accurately compute properties from atomic simulations is critical for advancing a large number of applications in chemistry and materials science including drug discovery, energy storage, and semiconductor manufacturing. To address this need, Meta FAIR presents a family of Universal Models for Atoms (UMA), designed to push the frontier of speed, accuracy, and generalization. UMA models are trained on half a billion unique 3D atomic structures (the largest training runs to date) by compiling data across multiple chemical domains, e.g. molecules, materials, and catalysts. We develop empirical scaling laws to help understand how to increase model capacity alongside dataset size to achieve the best accuracy. The UMA small and medium models utilize a novel architectural design we refer to as mixture of linear experts that enables increasing model capacity without sacrificing speed. For example, UMA-medium has 1.4B parameters but only ~50M active parameters per atomic structure. We evaluate UMA models on a diverse set of applications across multiple domains and find that, remarkably, a single model without any fine-tuning can perform similarly or better than specialized models. We are releasing the UMA code, weights, and associated data to accelerate computational workflows and enable the community to continue to build increasingly capable AI models.


Transition States Energies from Machine Learning: An Application to Reverse Water-Gas Shift on Single-Atom Alloys

Cheula, Raffaele, Andersen, Mie

arXiv.org Artificial Intelligence

Obtaining accurate transition state (TS) energies is a bottleneck in computational screening of complex materials and reaction networks due to the high cost of TS search methods and first-principles methods such as density functional theory (DFT). Here we propose a machine learning (ML) model for predicting TS energies based on Gaussian process regression with the Wasserstein Weisfeiler-Lehman graph kernel (WWL-GPR). Applying the model to predict adsorption and TS energies for the reverse water-gas shift (RWGS) reaction on single-atom alloy (SAA) catalysts, we show that it can significantly improve the accuracy compared to traditional approaches based on scaling relations or ML models without a graph representation. Further benefitting from the low cost of model training, we train an ensemble of WWL-GPR models to obtain uncertainties through subsampling of the training data and show how these uncertainties propagate to turnover frequency (TOF) predictions through the construction of an ensemble of microkinetic models. Comparing the errors in model-based vs DFT-based TOF predictions, we show that the WWL-GPR model reduces errors by almost an order of magnitude compared to scaling relations. This demonstrates the critical impact of accurate energy predictions on catalytic activity estimation. Finally, we apply our model to screen new materials, identifying promising catalysts for RWGS. This work highlights the power of combining advanced ML techniques with DFT and microkinetic modeling for screening catalysts for complex reactions like RWGS, providing a robust framework for future catalyst design.


Adsorb-Agent: Autonomous Identification of Stable Adsorption Configurations via Large Language Model Agent

Ock, Janghoon, Vinchurkar, Tirtha, Jadhav, Yayati, Farimani, Amir Barati

arXiv.org Artificial Intelligence

Adsorption energy is a key reactivity descriptor in catalysis, enabling efficient screening for optimal catalysts. However, determining adsorption energy typically requires evaluating numerous adsorbate-catalyst configurations. Current algorithmic approaches rely on exhaustive enumeration of adsorption sites and configurations, which makes the process computationally intensive and does not inherently guarantee the identification of the global minimum energy. In this work, we introduce Adsorb-Agent, a Large Language Model (LLM) agent designed to efficiently identify system-specific stable adsorption configurations corresponding to the global minimum adsorption energy. Adsorb-Agent leverages its built-in knowledge and emergent reasoning capabilities to strategically explore adsorption configurations likely to hold adsorption energy. By reducing the reliance on exhaustive sampling, it significantly decreases the number of initial configurations required while improving the accuracy of adsorption energy predictions. We evaluate Adsorb-Agent's performance across twenty representative systems encompassing a range of complexities. The Adsorb-Agent successfully identifies comparable adsorption energies for 83.7% of the systems and achieves lower energies, closer to the actual global minimum, for 35% of the systems, while requiring significantly fewer initial configurations than conventional methods. Its capability is particularly evident in complex systems, where it identifies lower adsorption energies for 46.7% of systems involving intermetallic surfaces and 66.7% of systems with large adsorbate molecules. These results demonstrate the potential of Adsorb-Agent to accelerate catalyst discovery by reducing computational costs and improving the reliability of adsorption energy predictions.


Explainable Data-driven Modeling of Adsorption Energy in Heterogeneous Catalysis

Vinchurkar, Tirtha, Ock, Janghoon, Farimani, Amir Barati

arXiv.org Artificial Intelligence

The increasing popularity of machine learning (ML) in catalysis has spurred interest in leveraging these techniques to enhance catalyst design. Our study aims to bridge the gap between physics-based studies and data-driven methodologies by integrating ML techniques with eXplainable AI (XAI). Specifically, we employ two XAI techniques: Post-hoc XAI analysis and Symbolic Regression. These techniques help us unravel the correlation between adsorption energy and the properties of the adsorbate-catalyst system. Leveraging a large dataset such as the Open Catalyst Dataset (OC20), we employ a combination of shallow ML techniques and XAI methodologies. Our investigation involves utilizing multiple shallow machine learning techniques to predict adsorption energy, followed by post-hoc analysis for feature importance, inter-feature correlations, and the influence of various feature values on the prediction of adsorption energy. The post-hoc analysis reveals that adsorbate properties exert a greater influence than catalyst properties in our dataset. The top five features based on higher Shapley values are adsorbate electronegativity, the number of adsorbate atoms, catalyst electronegativity, effective coordination number, and the sum of atomic numbers of the adsorbate molecule. There is a positive correlation between catalyst and adsorbate electronegativity with the prediction of adsorption energy. Additionally, symbolic regression yields results consistent with SHAP analysis. It deduces a mathematical relationship indicating that the square of the catalyst electronegativity is directly proportional to the adsorption energy. These consistent correlations resemble those derived from physics-based equations in previous research. Our work establishes a robust framework that integrates ML techniques with XAI, leveraging large datasets like OC20 to enhance catalyst design through model explainability.


AdsorbDiff: Adsorbate Placement via Conditional Denoising Diffusion

Kolluru, Adeesh, Kitchin, John R

arXiv.org Artificial Intelligence

Determining the optimal configuration of adsorbates on a slab (adslab) is pivotal in the exploration of novel catalysts across diverse applications. Traditionally, the quest for the lowest energy adslab configuration involves placing the adsorbate onto the slab followed by an optimization process. Prior methodologies have relied on heuristics, problem-specific intuitions, or brute-force approaches to guide adsorbate placement. In this work, we propose a novel framework for adsorbate placement using denoising diffusion. The model is designed to predict the optimal adsorbate site and orientation corresponding to the lowest energy configuration. Further, we have an end-to-end evaluation framework where diffusion-predicted adslab configuration is optimized with a pretrained machine learning force field and finally evaluated with Density Functional Theory (DFT). Our findings demonstrate an acceleration of up to 5x or 3.5x improvement in accuracy compared to the previous best approach. Given the novelty of this framework and application, we provide insights into the impact of pre-training, model architectures, and conduct extensive experiments to underscore the significance of this approach.


Adapting OC20-trained EquiformerV2 Models for High-Entropy Materials

Clausen, Christian M., Rossmeisl, Jan, Ulissi, Zachary W.

arXiv.org Artificial Intelligence

Computational high-throughput studies, especially in research on high-entropy materials and catalysts, are hampered by high-dimensional composition spaces and myriad structural microstates. They present bottlenecks to the conventional use of density functional theory calculations, and consequently, the use of machine-learned potentials is becoming increasingly prevalent in atomic structure simulations. In this communication, we show the results of adjusting and fine-tuning the pretrained EquiformerV2 model from the Open Catalyst Project to infer adsorption energies of *OH and *O on the out-of-domain high-entropy alloy Ag-Ir-Pd-Pt-Ru. By applying an energy filter based on the local environment of the binding site the zero-shot inference is markedly improved and through few-shot fine-tuning the model yields state-of-the-art accuracy. It is also found that EquiformerV2, assuming the role of general machine learning potential, is able to inform a smaller, more focused direct inference model. This knowledge distillation setup boosts performance on complex binding sites. Collectively, this shows that foundational knowledge learned from ordered intermetallic structures, can be extrapolated to the highly disordered structures of solid-solutions. With the vastly accelerated computational throughput of these models, hitherto infeasible research in the high-entropy material space is now readily accessible.


AdsorbRL: Deep Multi-Objective Reinforcement Learning for Inverse Catalysts Design

Lacombe, Romain, Hendren, Lucas, El-Awady, Khalid

arXiv.org Artificial Intelligence

A central challenge of the clean energy transition is the development of catalysts for low-emissions technologies. Recent advances in Machine Learning for quantum chemistry drastically accelerate the computation of catalytic activity descriptors such as adsorption energies. Here we introduce AdsorbRL, a Deep Reinforcement Learning agent aiming to identify potential catalysts given a multi-objective binding energy target, trained using offline learning on the Open Catalyst 2020 and Materials Project data sets. We experiment with Deep Q-Network agents to traverse the space of all ~160,000 possible unary, binary and ternary compounds of 55 chemical elements, with very sparse rewards based on adsorption energy known for only between 2,000 and 3,000 catalysts per adsorbate. To constrain the actions space, we introduce Random Edge Traversal and train a single-objective DQN agent on the known states subgraph, which we find strengthens target binding energy by an average of 4.1 eV. We extend this approach to multi-objective, goal-conditioned learning, and train a DQN agent to identify materials with the highest (respectively lowest) adsorption energies for multiple simultaneous target adsorbates. We experiment with Objective Sub-Sampling, a novel training scheme aimed at encouraging exploration in the multi-objective setup, and demonstrate simultaneous adsorption energy improvement across all target adsorbates, by an average of 0.8 eV. Overall, our results suggest strong potential for Deep Reinforcement Learning applied to the inverse catalysts design problem.


The Open DAC 2023 Dataset and Challenges for Sorbent Discovery in Direct Air Capture

Sriram, Anuroop, Choi, Sihoon, Yu, Xiaohan, Brabson, Logan M., Das, Abhishek, Ulissi, Zachary, Uyttendaele, Matt, Medford, Andrew J., Sholl, David S.

arXiv.org Artificial Intelligence

New methods for carbon dioxide removal are urgently needed to combat global climate change. Direct air capture (DAC) is an emerging technology to capture carbon dioxide directly from ambient air. Metal-organic frameworks (MOFs) have been widely studied as potentially customizable adsorbents for DAC. However, discovering promising MOF sorbents for DAC is challenging because of the vast chemical space to explore and the need to understand materials as functions of humidity and temperature. We explore a computational approach benefiting from recent innovations in machine learning (ML) and present a dataset named Open DAC 2023 (ODAC23) consisting of more than 38M density functional theory (DFT) calculations on more than 8,400 MOF materials containing adsorbed $CO_2$ and/or $H_2O$. ODAC23 is by far the largest dataset of MOF adsorption calculations at the DFT level of accuracy currently available. In addition to probing properties of adsorbed molecules, the dataset is a rich source of information on structural relaxation of MOFs, which will be useful in many contexts beyond specific applications for DAC. A large number of MOFs with promising properties for DAC are identified directly in ODAC23. We also trained state-of-the-art ML models on this dataset to approximate calculations at the DFT level. This open-source dataset and our initial ML models will provide an important baseline for future efforts to identify MOFs for a wide range of applications, including DAC.


AdsorbML: A Leap in Efficiency for Adsorption Energy Calculations using Generalizable Machine Learning Potentials

Lan, Janice, Palizhati, Aini, Shuaibi, Muhammed, Wood, Brandon M., Wander, Brook, Das, Abhishek, Uyttendaele, Matt, Zitnick, C. Lawrence, Ulissi, Zachary W.

arXiv.org Artificial Intelligence

Computational catalysis is playing an increasingly significant role in the design of catalysts across a wide range of applications. A common task for many computational methods is the need to accurately compute the adsorption energy for an adsorbate and a catalyst surface of interest. Traditionally, the identification of low energy adsorbate-surface configurations relies on heuristic methods and researcher intuition. As the desire to perform high-throughput screening increases, it becomes challenging to use heuristics and intuition alone. In this paper, we demonstrate machine learning potentials can be leveraged to identify low energy adsorbate-surface configurations more accurately and efficiently. Our algorithm provides a spectrum of trade-offs between accuracy and efficiency, with one balanced option finding the lowest energy configuration 87.36% of the time, while achieving a 2000x speedup in computation. To standardize benchmarking, we introduce the Open Catalyst Dense dataset containing nearly 1,000 diverse surfaces and 100,000 unique configurations.